(3.4.4) Adaptive boosting

Machine learning has a related technique called adaptive boosting (AdaBoost).

In machine learning, we create a program called a "classifier" that accepts an input and returns a classification result.

For example, consider how a human solves the problem "look at the picture and answer whether it is a mammal or a fish." The human receives the picture as input, thinks about whether it shows a mammal or a fish, and returns the classification result. A classifier does the same thing.

Adaptive boosting is a method for building a more accurate classifier by combining weak classifiers, each of which has low accuracy on its own. The key idea is to emphasize the questions that were answered incorrectly. Let's follow the flow of learning.

https://gyazo.com/4a914e86d14225d3f64c632c7930b577

Fig: Adaptive boosting

The grid of A and B in the upper left of the figure is the correct answer. We use weak classifiers that can only cut the space once, either vertically or horizontally. Such a classifier has only weak judgment abilities, such as "if it is in the top row, it is A" or "if it is in the first column, it is A."
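A weak classifier of this kind can be sketched in a few lines. The sketch below assumes a 2x3 grid whose top row is A, A, B and whose bottom row is A, B, B; this layout is reconstructed from the description in the text, and the actual figure may differ. The names `make_stump`, `accuracy`, and `all_stumps` are hypothetical and chosen only for illustration.

```python
# Correct answer, as assumed from the text: a 2x3 staircase (row 0 is the top row).
TRUTH = [["A", "A", "B"],
         ["A", "B", "B"]]
CELLS = [(row, col) for row in range(2) for col in range(3)]


def make_stump(axis, threshold, first_label):
    """Return a weak classifier that makes one vertical or horizontal cut.

    Cells before the cut get first_label, cells after it get the other label.
    """
    other_label = "B" if first_label == "A" else "A"

    def classify(row, col):
        # A vertical cut separates columns; a horizontal cut separates rows.
        position = col if axis == "vertical" else row
        return first_label if position < threshold else other_label

    return classify


def accuracy(classify):
    """Fraction of cells whose predicted label matches TRUTH."""
    hits = sum(classify(row, col) == TRUTH[row][col] for row, col in CELLS)
    return hits / len(CELLS)


# Enumerate every possible single cut and both ways of labelling its two sides.
all_stumps = [
    make_stump(axis, threshold, first_label)
    for axis, size in (("vertical", 3), ("horizontal", 2))
    for threshold in range(1, size)
    for first_label in ("A", "B")
]
best = max(accuracy(stump) for stump in all_stumps)
print(f"best single weak classifier: {best:.3f}")
```

Under this assumed layout, the best single cut gets 5 of the 6 cells right, so no weak classifier can answer everything correctly by itself.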

This is a difficult problem for a single weak classifier, because A and B are arranged in a staircase pattern. Let's see how weak classifiers solve it.

First, weak classifier ❶ learns the correct answer. Then we test whether the classifier can answer correctly. Suppose that in the test the classifier cuts vertically, as in answer 1. The top center answer is incorrect, because the correct answer is A and the classifier's answer is B (correctness check 1 in the figure).

Next, we set aside weak classifier ❶, which answered the first test, and train a new weak classifier ❷. This time we make it focus on the question that was answered incorrectly: we emphasize that the top center is A. Suppose that in the second test weak classifier ❷ cuts vertically, as in answer 2.
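The text describes this emphasis only qualitatively. One common way to make it concrete is the standard AdaBoost weight update, sketched below as an assumption about what "emphasize" could mean numerically, not as the exact procedure in the figure. Each question carries a weight, and after a test the weights of the wrongly answered questions are raised. The function name `reweight` and the six-cell ordering (top row first, left to right) are hypothetical.

```python
import math


def reweight(weights, correct):
    """One AdaBoost-style weight update.

    weights: current weights of the questions (they should sum to 1)
    correct: booleans, True where the weak classifier answered correctly
    Returns the new normalized weights and the classifier's vote strength alpha.
    """
    # Weighted error: total weight of the questions answered incorrectly.
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    # Assumes 0 < error < 1; a perfect or totally wrong classifier needs special handling.
    alpha = 0.5 * math.log((1 - error) / error)
    # Shrink the weights of correct answers, grow the weights of wrong ones.
    raised = [w * math.exp(-alpha if ok else alpha) for w, ok in zip(weights, correct)]
    total = sum(raised)
    return [w / total for w in raised], alpha


# Six cells with equal weight; weak classifier 1 is wrong only on the top center (index 1).
weights = [1 / 6] * 6
correct = [True, False, True, True, True, True]
new_weights, alpha = reweight(weights, correct)
print([round(w, 3) for w in new_weights])
```

With this update, the single wrongly answered cell ends up carrying half of the total weight, so the next weak classifier has a strong incentive to get it right.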

We combine the answers of weak classifiers ❶ and ❷ by majority vote (summary of answers so far 2).

In this case, the two classifiers disagree on the center column, so we treat both of those answers as incorrect (correctness check 2).

Finally, we emphasize these two incorrect answers and train a new weak classifier ❸.

Because we emphasized that the top center is A and the bottom center is B, suppose weak classifier ❸ cuts horizontally, as in answer 3.

When we combine the answers of the three weak classifiers by majority vote (summary of answers so far 3), we can see that the committee of three weak classifiers answers every question correctly (correctness check 3).
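The three rounds above can be replayed as a small sketch. It assumes the same 2x3 layout reconstructed from the text (top row A, A, B; bottom row A, B, B) and writes each weak classifier's answer out by hand; the figure itself may differ. The names `majority` and `wrong_cells` are hypothetical, and a tied vote is marked "?" and counted as incorrect, as in correctness check 2.

```python
from collections import Counter

# Assumed correct answer and the three answers described in the text.
TRUTH = [["A", "A", "B"],
         ["A", "B", "B"]]
ANSWER_1 = [["A", "B", "B"],
            ["A", "B", "B"]]  # vertical cut after the left column
ANSWER_2 = [["A", "A", "B"],
            ["A", "A", "B"]]  # vertical cut after the center column
ANSWER_3 = [["A", "A", "A"],
            ["B", "B", "B"]]  # horizontal cut


def majority(answers):
    """Combine answers cell by cell; a tie is recorded as '?'."""
    combined = []
    for row in range(2):
        combined_row = []
        for col in range(3):
            votes = Counter(answer[row][col] for answer in answers)
            (top_label, top_count), *rest = votes.most_common()
            tied = bool(rest) and rest[0][1] == top_count
            combined_row.append("?" if tied else top_label)
        combined.append(combined_row)
    return combined


def wrong_cells(answer):
    """List the (row, col) cells where the answer differs from TRUTH."""
    return [(r, c) for r in range(2) for c in range(3) if answer[r][c] != TRUTH[r][c]]


print(wrong_cells(ANSWER_1))                                  # after round 1
print(wrong_cells(majority([ANSWER_1, ANSWER_2])))            # after round 2
print(wrong_cells(majority([ANSWER_1, ANSWER_2, ANSWER_3])))  # after round 3
```

The list of wrong cells goes from the top center alone, to the whole center column, to empty, which matches the three correctness checks described above.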

In this way, by focusing on the questions that were answered incorrectly and repeating learning and testing, even weak classifiers can together capture complicated patterns.

Humans are the same. If a person who has not studied much is asked to "classify mammals and fish," they may think "if it lives in the water, it is a fish." This rule corresponds to weak classifier ❶. It is roughly correct, but it gives some wrong answers. For example, whales live in the water, but they are mammals.

Focusing on this wrong answer, we train a new weak classifier ❷: for example, "a creature longer than 5 meters is a mammal." Tuna are 3 meters at most and blue whales are 20 to 34 meters, so this looks good. However, it also gives wrong answers: whale sharks are fish, yet they reach as much as 20 meters. As weak classifier ❸, we add "if the tail fin is vertical, it is a fish." I think these three classifiers together classify fish and mammals correctly. However, the classification of animals is complicated, so there may still be wrong answers. (*20) By paying attention to such wrong answers and learning how to distinguish them, the rate of correct answers by majority vote of the weak classifiers rises rapidly.
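The three rules above can be written down as a small committee. This is only a sketch: the feature values are illustrative (the tuna and whale shark lengths come from the text, the blue whale length is picked from the range given, and the dolphin entry is an added assumption that is not in the original), and the names `rule_1`, `rule_2`, `rule_3`, and `committee` are hypothetical.

```python
from collections import Counter

# name: (lives in water, length in meters, tail fin is vertical, true class)
ANIMALS = {
    "tuna": (True, 3, True, "fish"),
    "blue whale": (True, 25, False, "mammal"),
    "whale shark": (True, 20, True, "fish"),
    "dolphin": (True, 2.5, False, "mammal"),  # illustrative values, not from the text
}


def rule_1(in_water, length_m, vertical_tail):
    """Weak classifier 1: if it lives in the water, it is a fish."""
    return "fish" if in_water else "mammal"


def rule_2(in_water, length_m, vertical_tail):
    """Weak classifier 2: a creature longer than 5 meters is a mammal."""
    return "mammal" if length_m > 5 else "fish"


def rule_3(in_water, length_m, vertical_tail):
    """Weak classifier 3: if the tail fin is vertical, it is a fish."""
    return "fish" if vertical_tail else "mammal"


RULES = (rule_1, rule_2, rule_3)


def committee(in_water, length_m, vertical_tail):
    """Majority vote of the three rules (three voters, two classes, so no ties)."""
    votes = Counter(rule(in_water, length_m, vertical_tail) for rule in RULES)
    return votes.most_common(1)[0][0]


for name, (*features, truth) in ANIMALS.items():
    answer = committee(*features)
    print(f"{name}: committee says {answer}, truth is {truth}")
```

With these values the committee gets the tuna, the blue whale, and the whale shark right, but it calls the small dolphin a fish. That is the kind of remaining wrong answer a further weak classifier would be trained to focus on.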

---

Footnote *20: Life forms are diverse. Mudskippers are fish that walk on land using their pectoral fins. Lungfish breathe mainly with lungs.
